New ht thcovmat#2126
Conversation
|
Greetings from your nice fit 🤖 !
Check the report carefully, and please buy me a ☕ , or better, a GPU 😉! |
28662e8 to
6b71fae
Compare
d75f1b9 to
8e1dc74
Compare
348be41 to
9ac473c
Compare
|
Hi @roy, I've updated the code so that now the covmat for power corrections can be constructed as in the case of scale variations. Please, look and let me know if something is unclear. I'd like if you could double-check the functions that compute the shifts, in particular when normalisation factors and conversion factors are used. There are few things that I still don't like here and there, but you're free to propose modifications. |
9ac473c to
84642cb
Compare
|
Greetings from your nice fit 🤖 !
Check the report carefully, and please buy me a ☕ , or better, a GPU 😉! |
39f5640 to
ef59af1
Compare
fdab4c9 to
9f286c6
Compare
3c8b923 to
de6497e
Compare
a7e265a to
49bca4f
Compare
scarlehoff
left a comment
There was a problem hiding this comment.
Hi @achiefa thanks for this.
I've left a few comments on the code. The 1000line higher_twist_functions I haven't look into in much detail, but one thing that I'd like to mention is please avoid using commondata_table. This is basically an object containing all commondata information like in the old days which was necessary to add for some of validphys functions but the idea is to instead get only the information you want.
For instance, the process type is part of the metadata now (no need to load the data and uncertainties!)
Same for the kinematics (although by the time you need the kinematics you often have already read the whole commondata). You can get the kinematics directly with cd.metadata.load_kinematics() and then you already have x, Q, and y for DIS as god intended!
32314ea to
d47da56
Compare
d47da56 to
f709249
Compare
0dd7e0a to
aa7c97b
Compare
ea9411e to
f033542
Compare
|
@scarlehoff @RoyStegeman Given that we agreed in Amsterdam to use power corrections in the studies for 4.1, I believe this PR must be reviewed and possibly merged into master. I've polished the code and added a doc page that explains what power corrections are in the context of NNPDF and how they can be included in the fit. Let me know if that is enough. |
scarlehoff
left a comment
There was a problem hiding this comment.
Thanks. We can talk on Tuesday to have someone go through it in more detail.
I've just looked at the changes very quickly and left a few comments, but since you have been rebasing this can probably be merged relatively quickly.
Btw, I think it might be a good idea to add a trimmed down version of the runcard you have added to the regression tests folder (in the root), without the "9 point" only with the power corrections.
That way we'll be testing at least partially the theory covmat part (which we are not doing right now to avoid downloading too many fktables).
| """ | ||
| # Check that `deltas1` and `deltas2` have the same shifts | ||
| if deltas1.keys() != deltas2.keys(): | ||
| raise RuntimeError('The two dictionaries do not contain the same shifts.') |
There was a problem hiding this comment.
Since you already added a check, this can be checked beforehand as well? Or is it not possible?
There was a problem hiding this comment.
Where did I add a check?
Do you mean the check check_pc_parameters? I'm not sure that the check you pointed out can be included in check_pc_parameters. Actually, I think the check above is a bit superfluous.
There was a problem hiding this comment.
In the checks.py there's a new check_pc_parameters. You can also add a check for the deltas.
There was a problem hiding this comment.
I think here there is a more subtle problem. compute_covs_pt_prescrip expects deltas1 and deltas2 to be lists of np.ndarrays, while here I'm using dicts. I think thist must be changed.
7178843 to
50c2fba
Compare
|
In the end, have the theories on the storage server been updated with the k-factors? |
|
Hi @RoyStegeman, what do you mean? Do you mean the non-perturbative c-factors provided by the experimentalists? We agreed in Amsterdam not to consider them, as the information provided by the experimentalists is, most of the time, sparse, lacking, or missing. For this reason, we will use the PC analysis as a diagnostic tool for the data selection criteria. We also agreed to consider dijet data only. If corrections turn out to be compatible with zero, then these datasets pass the selection criteria and possibly we will add the PC uncertainties as an additional covmat. If corrections are compatible with the experimental uncertainties, the dataset will be discarded, and we reassess the single-jet data. Anyway, the experimental NP corrections have now been added to theory_slim. See this PR. If instead you meant the k-factors for the PC the we determined in our analysis, then the answer is no. |
50c2fba to
9616c40
Compare
Indeed, this is what I meant. But from your message I understand that if they significantly deviate from 1, then the data will not be included so there is no need for the k-factors anyway |
964badf to
fa0ba36
Compare
scarlehoff
left a comment
There was a problem hiding this comment.
Left a few comments but haven't run the code yet. I've also look at higher_twist_functions very superficially for now, just got curious about the cuts.
I'm a bit worried about all the if conditions that this one prescription adds and whether it breaks some assumptions in the code. In particular the one where theoryid(s) is a one-item list...
scarlehoff
left a comment
There was a problem hiding this comment.
I tried to run the exmaple runcard and got:
~ vp-setupfit Basic_runcard_pc_covmat.yml
ValueError: Shape of passed values is (3148, 3148), indices imply (3418, 3418)
This does look like a problem with the cuts (with the funny coincidence of 14 -> 41)
Also, a lot of errors like this one before crashing:
[INFO]: Summary: 3238/3850 datapoints passed kinematic cuts.
/Volumes/csDisk/Repositories/NNPDF/nnpdf/validphys2/src/validphys/covmats_utils.py:88: RuntimeWarning: divide by zero encountered in matmul
return np.diag(diagonal) + corr_sys_mat @ corr_sys_mat.T
For the rest, I've had a look at the higher_twist_functions.py module and I've flagged a few things. It's probably not comprehensive enough, it is very long 😅 but anyway it can probably be merged once the issue above gets solved.
| ) | ||
| cond_high = np.multiply( | ||
| a >= bin_mid, a < bin_high if shift_pos != len(y_shift) - 1 else a <= bin_high | ||
| ) |
There was a problem hiding this comment.
I'm trying to wrap my head around these two statements but I'm a bit confused. I think you cannot get a True * True as long as shift_pos != len(y_shift) - 1
But if not shift_pos != len(y_shift) - 1 then you can get True in both if a == bin_mid (because the second one is <= while I think it should be just <?
But maybe I'm understanding this wrong.
(also, why instead of multiplying Trues and Falses don't you use an & here?)
There was a problem hiding this comment.
Tbg, I wrote this function almost two years ago and I detested it ever since. I'm pretty sure I can come up with a shorter version that does exactly the same thing.
There was a problem hiding this comment.
Something like this looks much better and cleaner.
def linear_bin_function_proposed(
a: npt.ArrayLike, y_shift: npt.ArrayLike, nodes: npt.ArrayLike
) -> np.ndarray:
a = np.asarray(a, dtype=float)
res = np.interp(a, nodes, y_shift)
res[(a < nodes[0]) | (a > nodes[-1])] = 0.0
return resThere was a problem hiding this comment.
I'm not that worried about the complexity of the function or the length, I'm just not convinced it is correct ^^U
|
Hi @scarlehoff, thanks for your comments and suggestions. I went through them all and I'll push the new fixes. I also fixed the basic runcard: now it works without crashing. I just don't understand how to reproduce this warning which btw is the same as #2455. |
|
If you are not getting the warning perhaps it is an issue with my version of numpy... |
I'm not able to reproduce this warning. |
|
Indeed, it seems it was my version of numpy. |
scarlehoff
left a comment
There was a problem hiding this comment.
Thanks, I'm not going to go through the whole of higher_twist_functions.py again (also because I think you played god with git's history so it is even more difficult to know what is the different with respect to the previous version) but I'll trust the changes are basically those suggested 😅
vp-setupfit at least now works
| log.info(f'{luxset} Lux pdf checked.') | ||
| luxset = fiatlux['luxset'] | ||
| luxset.load() | ||
| log.info(f'{luxset} Lux pdf checked.') |
There was a problem hiding this comment.
I'm a bit worried about these files having been touched. They were not in the previous version I reviewed were there?
This PR allows for the inclusion of theory uncertainties due to the effect of power corrections. The theory covariance matrix is constructed by computing the shifts for the theoretical predictions as done for the MHOUs. The shift is computed at the level of the structure functions. Then, the shifts for the structure functions are combined to reconstruct the shift for the xsec. For this reason, the calculation of the shift depends on the dataset. Currently, only 1-JET and DIS (NC and CC) are supported.
At the level of the runcard, power corrections are specified as follows
For each process/sf implemented, we need to specify a series of information:
ht. This is useful to identify the type of power correction, in particular when computing the posterior.yshiftare the magnitudes of the prior.nodescontains the points where the prior is shiftedThe array
pc_included_procsspecifies the processes for which the shifts are computed. I've also implemented the possibility to exclude some particular datasets within the selected processes, and this can be done by specifying the names inpc_excluded_exps.The keyfunc_typeis temporary and will be deleted once we decide which function to use to construct the prior.All the relevant details for this PR are contained in the module
higher_twist_functions.py. For each observable for which the shift has to be computed, I implemented a factory that constructs a function which will then compute the shift. I thought it was kind of necessary to make the shift dependent on the shift parameters (yshiftandnodes) and on the prescription according to which we vary the parameters (which for now is fixed), and not on the kinematics (I think this is known as currying in computer science).TO DO
[x] Removefunc_typepower_correctionsis specified but the parameters are not given?httoname(?)